Introduction
This vignette demonstrates the ForestSearch methodology for exploratory subgroup identification in survival analysis, as described in León et al. (2024) Statistics in Medicine .
Motivation
In clinical trials, particularly oncology, subgroup analyses are essential for:
Evaluating treatment effect consistency across patient populations
Identifying subgroups where treatment may be detrimental (harm)
Characterizing subgroups with enhanced benefit
Informing regulatory decisions and clinical practice
While prespecified subgroups provide stronger evidence, important subgroups based on patient characteristics may not be anticipated. ForestSearch provides a principled approach to exploratory subgroup identification with proper statistical inference.
Methodology Overview
ForestSearch identifies subgroups through:
Candidate factor selection : Using LASSO and/or Generalized Random Forests (GRF)
Exhaustive subgroup search : Evaluating all combinations up to maxk factors
Consistency-based selection : Applying splitting consistency criteria
Bootstrap bias correction : Adjusting for selection-induced optimism
Cross-validation : Assessing algorithm stability
The key innovation is the splitting consistency criterion : a subgroup is considered “consistent with harm” if, when randomly split 50/50 many times, both halves consistently show hazard ratios ≥ 1.0 (for example if 1.0 represents a meaningful “harm threshold”).
Setup
Load Required Packages
Show code
library (forestsearch)
library (survival)
library (data.table)
library (ggplot2)
library (gt)
library (grf)
library (policytree)
library (doFuture)
library (doRNG)
# Optional packages for enhanced output
library (patchwork)
library (weightedsurv)
# Set ggplot theme
theme_set (theme_minimal (base_size = 12 ))
Parallel Processing Configuration
ForestSearch supports parallel processing for computationally intensive operations (bootstrap, cross-validation).
Show code
# Detect available cores (use fewer for CRAN checks)
n_cores <- min (parallel:: detectCores () - 1 , 4 )
cat ("Using" , n_cores, "cores for parallel processing \n " )
Using 4 cores for parallel processing
Show code
# Configure parallel backend
registerDoFuture ()
registerDoRNG ()
Data: German Breast Cancer Study Group Trial
Study Background
The GBSG trial evaluated hormonal treatment (tamoxifen) versus chemotherapy in node-positive breast cancer patients. Key characteristics:
Sample size : N = 686
Outcome : Recurrence-free survival time
Censoring rate : ~56%
Treatment : Hormonal therapy (tamoxifen) vs. chemotherapy
Data Preparation
Show code
# Load GBSG data (included in forestsearch package)
df.analysis <- gbsg
# Prepare analysis variables
df.analysis <- within (df.analysis, {
id <- seq_len (nrow (df.analysis))
time_months <- rfstime / 30.4375
grade3 <- ifelse (grade == "3" , 1 , 0 )
treat <- hormon
})
# Define variable roles
confounders.name <- c ("age" , "meno" , "size" , "grade3" , "nodes" , "pgr" , "er" )
outcome.name <- "time_months"
event.name <- "status"
id.name <- "id"
treat.name <- "hormon"
# Display data structure
cat ("Sample size:" , nrow (df.analysis), " \n " )
Show code
cat ("Events:" , sum (df.analysis[[event.name]]),
sprintf ("(%.1f%%) \n " , 100 * mean (df.analysis[[event.name]])))
Show code
cat ("Baseline factors:" , paste (confounders.name, collapse = ", " ), " \n " )
Baseline factors: age, meno, size, grade3, nodes, pgr, er
Baseline Characteristics
Show code
create_summary_table (
data = df.analysis,
treat_var = treat.name,
table_title = "GBSG Baseline Characteristics by Treatment Arm" ,
vars_continuous = c ("age" , "nodes" , "size" , "er" , "pgr" ),
vars_categorical = c ("grade" , "meno" ),
font_size = 12
)
Characteristic
Control (n=440)
Treatment (n=246)
P-value
SMD
age
Mean (SD)
51.1 (10.0)
56.6 (9.4)
<0.001
0.57
nodes
Mean (SD)
4.9 (5.6)
5.1 (5.3)
0.665
0.03
size
Mean (SD)
29.6 (14.4)
28.8 (14.1)
0.470
0.06
er
Mean (SD)
79.7 (124.2)
125.8 (191.1)
<0.001
0.30
pgr
Mean (SD)
102.0 (170.0)
124.3 (249.7)
0.213
0.11
grade
0.273
0.06
1
48 (10.9%)
33 (13.4%)
2
281 (63.9%)
163 (66.3%)
3
111 (25.2%)
50 (20.3%)
meno
<0.001
0.27
0
231 (52.5%)
59 (24.0%)
1
209 (47.5%)
187 (76.0%)
Kaplan-Meier Analysis (ITT Population)
Show code
# Prepare counting process data for KM plot
dfcount <- df_counting (
df = df.analysis,
by.risk = 6 ,
tte.name = outcome.name,
event.name = event.name,
treat.name = treat.name
)
# Plot with confidence intervals and log-rank test
plot_weighted_km (
dfcount,
conf.int = TRUE ,
show.logrank = TRUE ,
ymax = 1.05 ,
xmed.fraction = 0.775 ,
ymed.offset = 0.125
)
The ITT Cox hazard ratio estimate is approximately 0.69 (95% CI: 0.54, 0.89), suggesting an overall benefit for hormonal therapy.
Preliminary Analysis: Generalized Random Forests
Before running ForestSearch, we can use GRF to explore potential treatment effect heterogeneity and identify candidate factors.
Show code
t0 <- proc.time ()
grf_est <- grf.subg.harm.survival (
data = df.analysis,
confounders.name = confounders.name,
outcome.name = outcome.name,
event.name = event.name,
id.name = id.name,
treat.name = treat.name,
maxdepth = 2 ,
n.min = 60 ,
dmin.grf = 12 ,
frac.tau = 0.6 ,
details = TRUE
)
tau, maxdepth = 46.75811 2
leaf.node control.mean control.size control.se depth
1 2 6.49 82.00 3.34 1
2 3 -4.10 604.00 1.06 1
11 4 -7.90 112.00 2.81 2
21 5 3.86 177.00 1.87 2
4 7 -5.89 356.00 1.33 2
Selected subgroup:
leaf.node control.mean control.size control.se depth
1 2 6.49 82.00 3.34 1
GRF subgroup found
Terminating node at max.diff (sg.harm.id):
[1] "er <= 0"
All splits:
[1] "er <= 0" "age <= 50" "age <= 43"
Show code
timings$ grf <- (proc.time () - t0)["elapsed" ]
Show code
# Display policy trees
# leaf1 = recommend control, leaf2 = recommend treatment
par (mfrow = c (1 , 2 ))
plot (grf_est$ tree1, leaf.labels = c ("Control" , "Treat" ), main = "Depth 1" )
Show code
plot (grf_est$ tree2, leaf.labels = c ("Control" , "Treat" ), main = "Depth 2" )
Show code
GRF identifies estrogen receptor status (ER) as a key factor, with ER ≤ 0 suggesting potential harm from hormonal therapy.
ForestSearch Analysis
Running ForestSearch
ForestSearch performs an exhaustive search over candidate subgroup combinations with up to maxk factors. Key parameters:
hr.threshold
1.25
Minimum HR for consistency evaluation
hr.consistency
1.0
Minimum consistency rate for candidates
pconsistency.threshold
0.90
Required consistency for selection
maxk
2
Maximum factors in subgroup definition
n.min
60
Minimum subgroup sample size
d0.min, d1.min
12
Minimum events per treatment arm
Show code
t0 <- proc.time ()
fs <- forestsearch (
df.analysis,
confounders.name = confounders.name,
outcome.name = outcome.name,
treat.name = treat.name,
event.name = event.name,
id.name = id.name,
# Threshold parameters (per León et al. 2024)
hr.threshold = 1.25 ,
hr.consistency = 1.0 ,
pconsistency.threshold = 0.90 ,
stop_threshold = 0.95 ,
# Search configuration
sg_focus = "hr" ,
max_subgroups_search = 10 ,
use_twostage = TRUE ,
# Factor selection
use_grf = TRUE ,
use_lasso = TRUE ,
cut_type = "default" ,
# Subgroup constraints
maxk = 2 ,
n.min = 60 ,
d0.min = 12 ,
d1.min = 12 ,
# Consistency evaluation
fs.splits = 100 ,
# Parallel processing
parallel_args = list (
plan = "multisession" ,
workers = n_cores,
show_message = TRUE
),
# Output options
showten_subgroups = TRUE ,
details = TRUE ,
plot.sg = TRUE
)
=== Two-Stage Consistency Evaluation Enabled ===
Stage 1 screening splits: 30
Maximum total splits: 100
Batch size: 20
================================================
GRF stage for cut selection with dmin, tau = 12 0.6
tau, maxdepth = 46.75811 2
leaf.node control.mean control.size control.se depth
1 2 6.49 82.00 3.34 1
2 3 -4.10 604.00 1.06 1
11 4 -7.90 112.00 2.81 2
21 5 3.86 177.00 1.87 2
4 7 -5.89 356.00 1.33 2
Selected subgroup:
leaf.node control.mean control.size control.se depth
1 2 6.49 82.00 3.34 1
GRF subgroup found
Terminating node at max.diff (sg.harm.id):
[1] "er <= 0"
All splits:
[1] "er <= 0" "age <= 50" "age <= 43"
GRF cuts identified: 3
Cuts: er <= 0, age <= 50, age <= 43
# of continuous/categorical characteristics 5 2
Continuous characteristics: age size nodes pgr er
Categorical characteristics: meno grade3
## Prior to lasso: age size nodes pgr er
#### Lasso selection results
7 x 1 sparse Matrix of class "dgCMatrix"
s0
age .
meno .
size 0.005433435
grade3 0.178139021
nodes 0.049670523
pgr -0.001812895
er .
Cox-LASSO selected: size grade3 nodes pgr
Cox-LASSO not selected: age meno er
### End Lasso selection
## After lasso: size nodes pgr
Default cuts included from Lasso: size <= mean(size) size <= median(size) size <= qlow(size) size <= qhigh(size) nodes <= mean(nodes) nodes <= median(nodes) nodes <= qlow(nodes) nodes <= qhigh(nodes) pgr <= mean(pgr) pgr <= median(pgr) pgr <= qlow(pgr) pgr <= qhigh(pgr)
Categorical after Lasso: grade3
Factors per GRF: er <= 0 age <= 50 age <= 43
Initial GRF cuts included er <= 0 age <= 50 age <= 43
Factors included per GRF (not in lasso) er <= 0 age <= 50 age <= 43
===== CONSOLIDATED CUT EVALUATION (IMPROVED) =====
Evaluating 16 cut expressions once and caching...
Cut evaluation summary:
Total cuts: 16
Valid cuts: 16
Errors: 0
✓ All 16 factors validated as 0/1
===== END CONSOLIDATED CUT EVALUATION =====
# of candidate subgroup factors= 16
[1] "er <= 0" "age <= 50" "age <= 43" "size <= 29.3" "size <= 25"
[6] "size <= 20" "size <= 35" "nodes <= 5" "nodes <= 3" "nodes <= 1"
[11] "nodes <= 7" "pgr <= 110" "pgr <= 32.5" "pgr <= 7" "pgr <= 131.8"
[16] "grade3"
Number of possible configurations (<= maxk): maxk = 2 , # combinations = 528
Events criteria: control >= 12 , treatment >= 12
Subgroup search completed in 0.01 minutes
Found 12 subgroup candidate(s)
# of candidate subgroups (meeting all criteria) = 12
Random seed set to: 8316951
Two-stage parameters:
n.splits.screen: 30
screen.threshold: 0.763
batch.size: 20
conf.level: 0.95
# of unique initial candidates: 12
# Restricting to top stop_Kgroups = 10
# of candidates to evaluate: 10
# Early stop threshold: 0.95
================================================================================
TOP 10 CANDIDATE SUBGROUPS FOR CONSISTENCY EVALUATION
Sorted by: hr
================================================================================
Rank HR N Events K Subgroup Definition
--------------------------------------------------------------------------------
1 2.537 61 34 2 er <= 0 & size <= 35
2 2.222 75 41 2 er <= 0 & pgr <= 32.5
3 2.164 68 38 2 er <= 0 & NOT(age <= 43)
4 2.054 61 35 2 er <= 0 & NOT(size <= 20)
5 1.992 64 34 2 er <= 0 & pgr <= 7
6 1.951 82 45 1 er <= 0
7 1.710 72 39 2 grade3 & pgr <= 7
8 1.707 71 36 2 age <= 50 & pgr <= 7
9 1.530 177 55 2 age <= 50 & NOT(age <= 43)
10 1.397 142 72 2 age <= 50 & pgr <= 32.5
--------------------------------------------------------------------------------
Parallel config: workers = 4 , batch_size = 1
Batch 1 / 10 : candidates 1 - 1
==================================================
EARLY STOP TRIGGERED (batch 1 )
Candidate: 1 of 10
Pcons: 0.99 >= 0.95
==================================================
Evaluated 1 of 10 candidates (early stop)
1 subgroups passed consistency threshold
*** Subgroup found: {er <= 0} {size <= 35}
% consistency criteria met= 0.99
SG focus = hr
Seconds and minutes forestsearch overall = 2.342 0.039
Consistency algorithm used: twostage
Show code
plan ("sequential" )
timings$ forestsearch <- (proc.time () - t0)["elapsed" ]
cat (" \n ForestSearch completed in" ,
round (timings$ forestsearch, 1 ), "seconds \n " )
ForestSearch completed in 2.3 seconds
ForestSearch Results
Identified Subgroups
Show code
# Generate results tables
res_tabs <- sg_tables (fs, ndecimals = 3 , which_df = "est" )
# Display top subgroups meeting criteria
res_tabs$ sg10_out
Two-factor subgroups (maxk=2)
{er <= 0}
{size <= 35}
61
34
15
2.537
0.990
Search Configuration: Single-factor candidates (L) = 32; Maximum combinations evaluated = 528; Search depth (maxk) = 2
Search Results: Candidate subgroups found = 12; Maximum HR estimate = 2.54
Note: E1 = events in treatment arm; Pcons = consistency proportion
Treatment Effect Estimates
Show code
# ITT and subgroup estimates
res_tabs$ tab_estimates
Training data estimates
ITT
686 (100.0%)
246 (35.9%)
299 (43.6%)
66.3
50.2
7.8
0.69 (0.54, 0.89)
Questionable
61 (8.9%)
23 (37.7%)
34 (55.7%)
18.5
48
-19
2.54 (1.25, 5.17)
Recommend
625 (91.1%)
223 (35.7%)
265 (42.4%)
66.7
52.2
9.6
0.61 (0.47, 0.79)
Identified Subgroup Definition
Show code
cat ("Identified subgroup (H):" , paste (fs$ sg.harm, collapse = " & " ), " \n " )
Identified subgroup (H): {er <= 0} & {size <= 35}
Show code
cat ("Subgroup size:" , sum (fs$ df.est$ treat.recommend == 0 ),
sprintf ("(%.1f%% of ITT) \n " ,
100 * mean (fs$ df.est$ treat.recommend == 0 )))
Subgroup size: 61 (8.9% of ITT)
ForestSearch identifies Estrogen ≤ 0 (ER-negative) as the subgroup with potential harm. This is biologically plausible: tamoxifen is a selective estrogen receptor modulator with limited efficacy in ER-negative tumors.
Bootstrap Bias Correction
Rationale
Cox model estimates from identified subgroups are upwardly biased due to the selection process (subgroups are selected because they show extreme effects). Bootstrap bias correction addresses this by:
Resampling with replacement
Re-running the entire ForestSearch algorithm
Computing bias terms from bootstrap vs. observed estimates
Applying infinitesimal jackknife variance estimation
Running Bootstrap Analysis
Show code
# Number of bootstrap iterations
# Use 500-2000 for production; reduced here for vignette
NB <- 3
t0 <- proc.time ()
fs_bc <- forestsearch_bootstrap_dofuture (
fs.est = fs,
nb_boots = NB,
show_three = FALSE ,
details = TRUE
)
Ystar matrix generated should be 'boots x N': 3 x 686
ForestSearch parameters for bootstrap iterations:
- sg_focus: hr
- maxk: 2
- fs.splits: 100
- max_subgroups_search: 10
- hr.threshold: 1.25
- hr.consistency: 1
- pconsistency.threshold: 0.9
- n.min: 60
- use_twostage: TRUE
- use_lasso: TRUE
- use_grf: TRUE
Bootstrap-specific overrides:
- grf_res: NULL (forces re-selection)
- grf_cuts: NULL (forces re-selection)
- parallel_args: sequential (prevents nested parallelism)
- details: FALSE (suppressed in workers)
- plot.sg: FALSE
- plot.grf: FALSE
=== Bootstrap Analysis Complete ===
Success rate: 100.0% (3/3)
H (Questionable) Estimates:
Unadjusted: 2.54 (1.25,5.17)
Bias-corrected: 2.08 (0.33,13.07)
Hc (Recommend) Estimates:
Unadjusted: 0.61 (0.47,0.79)
Bias-corrected: 0.58 (0.16,2.02)
===================================
Show code
plan ("sequential" )
timings$ bootstrap <- (proc.time () - t0)["elapsed" ]
cat (" \n Bootstrap completed in" ,
round (timings$ bootstrap / 60 , 1 ), "minutes \n " )
Bootstrap completed in 0.1 minutes
Bootstrap Summary and Diagnostics
Show code
# Comprehensive summary with diagnostics
summaries <- summarize_bootstrap_results (
sgharm = fs$ sg.harm,
boot_results = fs_bc,
create_plots = TRUE ,
est.scale = "hr"
)
===============================================================
BOOTSTRAP ANALYSIS SUMMARY
===============================================================
IDENTIFIED SUBGROUP:
-------------------------------------------------------------
H: {er <= 0} & {size <= 35}
BOOTSTRAP SUCCESS METRICS:
-------------------------------------------------------------
Total iterations: 3
Successful subgroup ID: 3 (100.0%)
Failed to find subgroup: 0 (0.0%)
TIMING ANALYSIS:
-------------------------------------------------------------
Overall:
Total bootstrap time: 0.07 minutes (0.00 hours)
Average per iteration: 0.02 min (1.3 sec)
Projected for 1000 boots: 21.93 min (0.37 hrs)
Show code
# Display bias-corrected estimates table
summaries$ table
Bootstrap bias-corrected estimates (3 iterations)
N
NT
Events
MedT
MedC
RMSTd
HR (95% CI)†
HR‡ (95% CI)
Qstnbl
61 (8.9%)
23 (37.7%)
34 (55.7%)
18.5
48
-19
2.54 (1.25, 5.17)
2.08 (0.33,13.07)
Recmnd
625 (91.1%)
223 (35.7%)
265 (42.4%)
66.7
52.2
9.6
0.61 (0.47, 0.79)
0.58 (0.16,2.02)
Note : Med = Median survival time (months). RMSTd = Restricted mean survival time difference. Subgroup identified in 100.0% of bootstrap samples.
Event Count Summary
Low event counts can lead to unstable HR estimates. This summary helps identify potential issues:
Show code
event_summary <- summarize_bootstrap_events (fs_bc, threshold = 12 )
=== Bootstrap Event Count Summary ===
Total bootstrap iterations: 3
Event threshold: <12 events
ORIGINAL Subgroup H on BOOTSTRAP samples:
Control arm <12 events: 0 (0.0%)
Treatment arm <12 events: 0 (0.0%)
Either arm <12 events: 0 (0.0%)
ORIGINAL Subgroup Hc on BOOTSTRAP samples:
Control arm <12 events: 0 (0.0%)
Treatment arm <12 events: 0 (0.0%)
Either arm <12 events: 0 (0.0%)
NEW Subgroups found: 3 (100.0%)
NEW Subgroup H* on ORIGINAL data:
Control arm <12 events: 0 (0.0% of successful)
Treatment arm <12 events: 0 (0.0% of successful)
Either arm <12 events: 0 (0.0% of successful)
NEW Subgroup Hc* on ORIGINAL data:
Control arm <12 events: 0 (0.0% of successful)
Treatment arm <12 events: 0 (0.0% of successful)
Either arm <12 events: 0 (0.0% of successful)
Bootstrap Diagnostics
Show code
# Quality metrics
summaries$ diagnostics_table_gt
Analysis of 3 bootstrap iterations
Success Rate
Total iterations
3
Successful
3 (100.0%)
Failed
0 (0.0%)
Success rating
Excellent
Subgroup H (Questionable)
Observed HR
2.537
Bias-corrected HR
2.077
Bootstrap CV (%)
44.5%
N estimates
3
Subgroup Hc (Recommend)
Observed HR
0.608
Bias-corrected HR
0.575
Bootstrap CV (%)
42.8%
N estimates
3
Subgroup Agreement
How consistently does bootstrap identify the same subgroup?
Show code
# Agreement with original analysis
if (! is.null (summaries$ subgroup_summary$ original_agreement)) {
summaries$ subgroup_summary$ original_agreement
}
Metric Value
<char> <char>
1: Total bootstrap iterations 3
2: Successful iterations 3
3: Failed iterations (no subgroup) 0
4:
5: Original subgroup definition {er <= 0} & {size <= 35}
6: Exact match with original 0 (0.0%)
7: Different from original 3 (100.0%)
8: Partial match (shared factor) 1 (33.3%)
Show code
# Factor presence across bootstrap iterations
if (! is.null (summaries$ subgroup_summary$ factor_presence)) {
summaries$ subgroup_summary$ factor_presence
}
Rank Factor Count Percent
4 1 pgr 2 66.66667
1 2 age 1 33.33333
2 3 er 1 33.33333
3 4 grade3 1 33.33333
5 5 size 1 33.33333
Bootstrap Distributions
Show code
if (! is.null (summaries$ plots)) {
summaries$ plots$ H_distribution + summaries$ plots$ Hc_distribution
}
Cross-Validation
Cross-validation assesses the stability of the ForestSearch algorithm. Two approaches are available:
K-Fold Cross-Validation
Show code
# 10-fold CV with multiple iterations
# Use Ksims >= 50 for production
Ksims <- 10
t0 <- proc.time ()
fs_kfold <- forestsearch_tenfold (
fs.est = fs,
sims = Ksims,
Kfolds = 10 ,
details = FALSE ,
parallel_args = list (
plan = "multisession" ,
workers = n_cores,
show_message = TRUE
)
)
plan ("sequential" )
timings$ kfold <- (proc.time () - t0)["elapsed" ]
# Summary metrics
print (fs_kfold$ find_summary)
print (fs_kfold$ sens_summary)
Out-of-Bag (N-Fold) Cross-Validation
N-fold CV (leave-one-out) provides the most rigorous stability assessment:
Show code
t0 <- proc.time ()
fs_OOB <- forestsearch_Kfold (
fs.est = fs,
details = TRUE ,
Kfolds = nrow (df.analysis), # N-fold = leave-one-out
parallel_args = list (
plan = "callr" ,
workers = n_cores,
show_message = TRUE
)
)
plan ("sequential" )
timings$ oob <- (proc.time () - t0)["elapsed" ]
# Summarize OOB results
summary_OOB <- forestsearch_KfoldOut (
res = fs_OOB,
details = TRUE ,
outall = TRUE
)
# Subgroup definitions found
table (summary_OOB$ SGs_found[, 1 ])
Results Visualization
Forest Plot
The forest plot summarizes treatment effects across the ITT population, reference subgroups, and identified subgroups with cross-validation metrics.
Show code
# Define reference subgroups for comparison
subgroups <- list (
age_gt65 = list (
subset_expr = "age > 65" ,
name = "Age > 65" ,
type = "reference"
),
age_le65 = list (
subset_expr = "age <= 65" ,
name = "Age ≤ 65" ,
type = "reference"
),
pgr_positive = list (
subset_expr = "pgr > 0" ,
name = "PgR > 0" ,
type = "reference"
),
pgr_negative = list (
subset_expr = "pgr <= 0" ,
name = "PgR ≤ 0" ,
type = "reference"
)
)
# Create forest plot
# Include fs_kfold and fs_OOB if available for CV metrics
result <- plot_subgroup_results_forestplot (
fs_results = list (
fs.est = fs,
fs_bc = fs_bc,
fs_OOB = NULL ,
fs_kfold = NULL
),
df_analysis = df.analysis,
subgroup_list = subgroups,
outcome.name = outcome.name,
event.name = event.name,
treat.name = treat.name,
E.name = "Hormonal" ,
C.name = "Chemo" ,
ci_column_spaces = 25 ,
xlog = TRUE
)
# Display
plot (result$ plot)
Kaplan-Meier by Identified Subgroups
Show code
# Create KM plots for identified subgroups
# Alternatively, use weightedsurv package as illustrated above;
# And also in the forestsearch results (above) where the subgroup figures are also displayed
par (mfrow = c (1 , 2 ))
# H subgroup (potential harm)
df_H <- subset (fs$ df.est, treat.recommend == 0 )
if (nrow (df_H) > 20 ) {
fit_H <- survfit (
as.formula (paste0 ("Surv(" , outcome.name, ", " , event.name, ") ~ " , treat.name)),
data = df_H
)
plot (fit_H, col = c ("blue" , "red" ), lwd = 2 ,
xlab = "Time (months)" , ylab = "Survival Probability" ,
main = paste ("H:" , paste (fs$ sg.harm, collapse = " & " )))
legend ("bottomleft" , c ("Control" , "Treatment" ), col = c ("blue" , "red" ), lwd = 2 )
}
# Hc subgroup (complement)
df_Hc <- subset (fs$ df.est, treat.recommend == 1 )
if (nrow (df_Hc) > 20 ) {
fit_Hc <- survfit (
as.formula (paste0 ("Surv(" , outcome.name, ", " , event.name, ") ~ " , treat.name)),
data = df_Hc
)
plot (fit_Hc, col = c ("blue" , "red" ), lwd = 2 ,
xlab = "Time (months)" , ylab = "Survival Probability" ,
main = "Hc: Complement (ER > 0)" )
legend ("bottomleft" , c ("Control" , "Treatment" ), col = c ("blue" , "red" ), lwd = 2 )
}
Show code
Summary and Interpretation
Key Findings
Show code
# Extract key results
cat ("=" %>% rep (60 ) %>% paste (collapse = "" ), " \n " )
============================================================
Show code
cat ("FORESTSEARCH ANALYSIS SUMMARY \n " )
FORESTSEARCH ANALYSIS SUMMARY
Show code
cat ("=" %>% rep (60 ) %>% paste (collapse = "" ), " \n\n " )
============================================================
Show code
cat ("Dataset: GBSG (N =" , nrow (df.analysis), ") \n " )
Show code
cat ("Outcome: Recurrence-free survival \n\n " )
Outcome: Recurrence-free survival
Show code
Show code
cat (" HR (95% CI): 0.69 (0.54, 0.89) \n\n " )
HR (95% CI): 0.69 (0.54, 0.89)
Show code
cat ("Identified Subgroup (H): \n " )
Show code
cat (" Definition:" , paste (fs$ sg.harm, collapse = " & " ), " \n " )
Definition: {er <= 0} & {size <= 35}
Show code
cat (" Size:" , sum (fs$ df.est$ treat.recommend == 0 ),
sprintf ("(%.1f%%) \n " , 100 * mean (fs$ df.est$ treat.recommend == 0 )))
Show code
cat (" Unadjusted HR:" , sprintf ("%.2f" , exp (fs$ grp.consistency$ out_sg$ result$ hr[1 ])), " \n " )
Show code
cat (" \n Complement Subgroup (Hc): \n " )
Complement Subgroup (Hc):
Show code
cat (" Size:" , sum (fs$ df.est$ treat.recommend == 1 ),
sprintf ("(%.1f%%) \n " , 100 * mean (fs$ df.est$ treat.recommend == 1 )))
Clinical Interpretation
The ForestSearch analysis identifies estrogen receptor-negative (ER ≤ 0) patients as a subgroup with potential lack of benefit from hormonal therapy.
Biological plausibility : Tamoxifen is a selective estrogen receptor modulator. Its efficacy depends on ER expression. The finding that ER-negative patients may not benefit is consistent with:
Mechanistic understanding of tamoxifen action
Meta-analyses showing no tamoxifen benefit in ER-negative breast cancer
Clinical guidelines recommending tamoxifen primarily for ER-positive tumors
Caveats :
This is an exploratory analysis requiring independent validation
The bias-corrected estimates have wider confidence intervals
Cross-validation metrics should be evaluated for algorithm stability
Computational Timing
Show code
timings$ total <- (proc.time () - t_vignette_start)["elapsed" ]
timing_df <- data.frame (
Analysis = c ("GRF" , "ForestSearch" , "Bootstrap" , "Total" ),
Seconds = c (
timings$ grf,
timings$ forestsearch,
timings$ bootstrap,
timings$ total
)
)
timing_df$ Minutes <- timing_df$ Seconds / 60
gt (timing_df) |>
tab_header (title = "Computational Timing" ) |>
fmt_number (columns = c (Seconds, Minutes), decimals = 1 ) |>
cols_label (
Analysis = "Component" ,
Seconds = "Time (sec)" ,
Minutes = "Time (min)"
)
Component
Time (sec)
Time (min)
GRF
0.2
0.0
ForestSearch
2.3
0.0
Bootstrap
5.0
0.1
Total
9.1
0.2
References
León LF, Jemielita T, Guo Z, Marceau West R, Anderson KM (2024). “Exploratory subgroup identification in the heterogeneous Cox model: A relatively simple procedure.” Statistics in Medicine . DOI: 10.1002/sim.10163